oracle_cdc: add benchmark suite #4082

Merged
josephwoodward merged 1 commit into main from jw/oracledb_cdc_benchmark
Mar 16, 2026

josephwoodward (Contributor) commented Mar 10, 2026

Initial data set

Oracle version: Oracle Database Express Edition (XE) Release 21c running in a container.

18,000,000 rows, roughly 3 GB

SQL> SELECT segment_name, bytes / 1024 / 1024 AS mb FROM dba_segments WHERE owner 
  2* = 'TESTDB' ORDER BY bytes DESC;

SEGMENT_NAME         MB 
_______________ _______ 
USERS              2950 
SYS_C008322         288 
SQL> select count(*) from users;

   COUNT(*) 
___________ 
   18000000 

The results below focus on streaming rather than snapshotting. Snapshotting allows a greater degree of concurrency, and its behaviour is already well understood across SQL-based components, which share similar characteristics. In this instance I was seeing ~140,000 msg/sec on a single table during snapshot.

LogMiner Streaming Results

Time taken: ~12 minutes
Average: ~50,000 msg/s (total average: 25,000 msg/s)
Peak: ~70,000-90,000 msg/s
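
As a rough cross-check of the figures above (a sketch, not part of the benchmark tooling): dividing the 18,000,000 rows by the ~12 minute run time reproduces the 25,000 msg/s total average. The variable names are illustrative, and treating ~50,000 msg/s as the rate while messages are actually flowing is an assumption about how that average was measured.

```python
ROWS = 18_000_000          # rows in USERS, from the count(*) above
ELAPSED_SECONDS = 12 * 60  # "~12 minutes", so this is approximate
STREAMING_RATE = 50_000    # assumed: average rate while messages are flowing

# Overall average across the whole run.
total_average = ROWS / ELAPSED_SECONDS

# How long the same rows would take at the assumed streaming rate.
seconds_at_streaming_rate = ROWS / STREAMING_RATE

print(f"total average: {total_average:,.0f} msg/s")
print(f"time at {STREAMING_RATE:,} msg/s: {seconds_at_streaming_rate / 60:.0f} min")
```

If that reading is right, the rows themselves account for about 6 of the ~12 minutes, with the remainder spent below the ~50,000 msg/s rate.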

Observations:

The highest throughput the connector appears to reach is about 70,000-90,000 messages per second. Removing the buffering improves things slightly, but not by much, which indicates that the limitation is how quickly we can fetch data from LogMiner.

I ran a quick test with Debezium and its OracleCDC component, which also uses LogMiner, and saw similar results, though more thorough testing would be needed to confirm this.

LogMiner was never designed for CDC and, as a result, has a number of performance characteristics that are always going to impede its throughput, such as:

  • Requiring buffering of transactions until a COMMIT or a ROLLBACK is received before flushing
  • There can be large gaps between the redo log's SCNs before we reach data
  • Being single threaded
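
To illustrate the first point, here is a minimal sketch of commit-gated buffering. It is not the connector's actual implementation: `RedoEvent` and `flush_committed` are hypothetical names, and the event shape is a simplified stand-in for what LogMiner returns.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class RedoEvent:
    """Hypothetical, simplified stand-in for one entry read from LogMiner."""
    scn: int                    # system change number of the redo entry
    xid: str                    # transaction identifier
    op: str                     # "INSERT", "UPDATE", "DELETE", "COMMIT" or "ROLLBACK"
    row: Optional[dict] = None  # row payload for data changes


def flush_committed(events):
    """Yield data changes only once their transaction has committed."""
    pending = defaultdict(list)  # xid -> changes still awaiting an outcome
    for ev in events:
        if ev.op == "COMMIT":
            # Outcome known: release the buffered changes in arrival order.
            yield from pending.pop(ev.xid, [])
        elif ev.op == "ROLLBACK":
            # Transaction abandoned: discard its changes without emitting.
            pending.pop(ev.xid, None)
        else:
            pending[ev.xid].append(ev)


events = [
    RedoEvent(100, "tx1", "INSERT", {"id": 1}),
    RedoEvent(101, "tx2", "INSERT", {"id": 2}),
    RedoEvent(102, "tx2", "ROLLBACK"),
    RedoEvent(103, "tx1", "COMMIT"),
]
emitted = [ev.row for ev in flush_committed(events)]
print(emitted)
```

Nothing is emitted for `tx1` until its COMMIT arrives at SCN 103, so a long-running transaction holds its changes in memory for its whole lifetime.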

claude bot commented Mar 10, 2026

Commits
LGTM

Review
This PR adds a benchmark suite under internal/impl/oracledb/bench/ with Docker management tasks, Oracle schema setup, data generation scripts, RMAN configuration, and a Redpanda Connect benchmark pipeline config. No Go code changes.

LGTM

claude bot commented Mar 10, 2026

Commits

  1. b1b275b oracle_cdc: benchmark. The message is vague and does not use imperative mood. It should describe the action taken (e.g., oracle_cdc: add benchmark data scripts or oracle_cdc: add benchmark config and SQL seed data). If this is a follow-up to 4c066a28 (oracle_cdc: add benchmark suite), consider squashing them into a single commit since they appear to be one logical change.

Review
All changes are benchmark/dev support files (Taskfile, SQL scripts, YAML config, RMAN script) under internal/impl/oracledb/bench/. No Go code changes, no pattern violations, no security issues.

LGTM

claude bot commented Mar 10, 2026

Commits

  1. Commits b1b275b and 5019d3d both have the message oracle_cdc: benchmark — these are vague and identical. Each commit message should describe the specific change it introduces (per commit policy: flag messages that are vague). Consider squashing these into the first commit or giving each a distinct, descriptive message (e.g., oracle_cdc: add benchmark data scripts, oracle_cdc: add benchmark config).

Review
PR adds benchmark tooling (SQL scripts, Taskfile, RMAN config, YAML benchmark config) under internal/impl/oracledb/bench/. No Go code changes, no production code affected.

LGTM

josephwoodward force-pushed the jw/oracledb_cdc_benchmark branch 3 times, most recently from 418b930 to 2913356, March 13, 2026 11:43

claude bot commented Mar 13, 2026

Commits
LGTM

Review
This PR adds a benchmark suite for the Oracle CDC component under internal/impl/oracledb/bench/, including a Taskfile, SQL setup/data scripts, an RMAN config, and a benchmark pipeline config. All files are local development tooling with no Go code changes.

LGTM

claude bot commented Mar 13, 2026

Commits

  1. 4f8bb6c feat: add readme. The prefix feat is not a recognized commit prefix per the commit policy. Expected either a system name (e.g., oracle_cdc: add readme) or chore: add readme. The first commit correctly uses oracle_cdc: as the system prefix.

Review
This PR adds a benchmark suite for the Oracle CDC component — SQL scripts, Taskfile, YAML config, RMAN script, and a README. No Go code changes.

  1. Bug in create.sql: Stray semicolon inside EXECUTE IMMEDIATE string will cause ORA-00911 at runtime — see create.sql#L27-L28.
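
A check along these lines could catch this class of problem before the script is run. This is a heuristic sketch only: the contents of create.sql are not shown here, `has_stray_semicolon` is a hypothetical helper, and it assumes the offending semicolon terminates a plain SQL statement passed to EXECUTE IMMEDIATE. PL/SQL blocks, which legitimately end in `END;`, are not flagged.

```python
import re

# A dynamic PL/SQL block ends with "END;" or "END <label>;" and needs its semicolon.
_PLSQL_BLOCK_END = re.compile(r"\bEND(\s+\w+)?\s*;\s*$", re.IGNORECASE)


def has_stray_semicolon(dynamic_sql: str) -> bool:
    """Return True if a plain SQL statement ends with a terminating ';'."""
    text = dynamic_sql.strip()
    return text.endswith(";") and not _PLSQL_BLOCK_END.search(text)


print(has_stray_semicolon("CREATE TABLE t (id NUMBER);"))  # trailing ';' in plain SQL
print(has_stray_semicolon("CREATE TABLE t (id NUMBER)"))   # no terminator
print(has_stray_semicolon("BEGIN NULL; END;"))             # PL/SQL block
```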

josephwoodward force-pushed the jw/oracledb_cdc_benchmark branch from 4f8bb6c to 12aa42e, March 13, 2026 12:15

claude bot commented Mar 13, 2026

Commits

  1. oracle_cdc: add benchmark suite — The system name oracle_cdc does not match the component directory oracledb or the component name oracledb_cdc (as used in the benchmark config). Should be oracledb_cdc: add benchmark suite or oracledb: add benchmark suite.

Review
PR adds benchmark tooling (SQL scripts, Taskfile, RMAN config, benchmark YAML, README) for the Oracle CDC component under internal/impl/oracledb/bench/. No Go code changes.

LGTM

josephwoodward force-pushed the jw/oracledb_cdc_benchmark branch from 207262e to db5e37b, March 13, 2026 15:52

claude bot commented Mar 13, 2026

Commits

  1. oracle_cdc: add benchmark suite — The system name oracle_cdc does not match the known area directory oracledb or the component name oracledb_cdc. Should be oracledb: add benchmark suite or oracledb(cdc): add benchmark suite.

Review
This PR adds a benchmark suite for the Oracle CDC component — SQL scripts, Taskfile, YAML config, and a README. No Go code changes.

  1. README.md L51: Documentation says products.sql inserts 500k rows, but the script actually inserts 200k (products_total := 200000).

josephwoodward force-pushed the jw/oracledb_cdc_benchmark branch from db5e37b to 7e9de81, March 16, 2026 11:30
josephwoodward marked this pull request as ready for review March 16, 2026 11:31

claude bot commented Mar 16, 2026

Commits
LGTM

Review
This PR adds a benchmark suite for the Oracle CDC component under internal/impl/oracledb/bench/. It includes SQL scripts for table creation and data generation, a Taskfile for orchestration, a benchmark pipeline config, an RMAN setup script, and a README. No Go code changes.

LGTM

josephwoodward force-pushed the jw/oracledb_cdc_benchmark branch from 7e9de81 to 0477733, March 16, 2026 12:57

claude bot commented Mar 16, 2026

Commits
LGTM

Review
PR adds a benchmark suite for the Oracle CDC component: SQL scripts for table creation and data generation, a Taskfile for local Docker-based Oracle setup, a benchmark pipeline config, and a README with instructions. No Go code changes.

LGTM

josephwoodward force-pushed the jw/oracledb_cdc_benchmark branch from 0477733 to 6952f0e, March 16, 2026 13:32

claude bot commented Mar 16, 2026

Commits
LGTM

Review
PR adds a benchmark suite for the Oracle CDC component under internal/impl/oracledb/bench/ — SQL scripts for schema creation and data generation, a Taskfile for orchestration, a benchmark YAML config, and a README. No Go code changes.

LGTM

josephwoodward merged commit 2f63cdf into main Mar 16, 2026
7 checks passed
josephwoodward deleted the jw/oracledb_cdc_benchmark branch March 16, 2026 17:58